amLite: Amharic Transliteration Using Key Map Dictionary
Abstract
amLite is a framework for mapping ASCII-transliterated Amharic text back to the original Amharic script. Its aim is to make existing Amharic linguistic data consistent and interoperable among researchers. To this end, a key map dictionary is constructed from the ASCII combinations actively in use for transliterating Amharic letters, and each combination is mapped to its corresponding Amharic letter. The mapping is then used to replace the transliterated text with the original Amharic letters. The framework achieved 97.7%, 99.7%, and 98.4% accuracy when converting three random sample test sets. It is, however, possible to improve the accuracy by adding an exception to the implementation of the algorithm or by preprocessing the input text prior to conversion. This paper outlines the rationale behind the framework and the process undertaken in its development.
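The key-map idea in the abstract can be illustrated with a minimal sketch. It assumes a greedy longest-match lookup, which the abstract does not confirm as amLite's actual strategy. The `KEY_MAP` entries and the `transliterate` function are hypothetical names chosen for illustration; the paper's real dictionary is not reproduced here.

```python
# Hypothetical, tiny key map for illustration only. The ASCII keys loosely
# follow common Amharic romanization habits; they are NOT the paper's dictionary.
KEY_MAP = {
    "se": "ሰ", "s": "ስ",
    "le": "ለ", "la": "ላ", "l": "ል",
    "me": "መ", "ma": "ማ", "m": "ም",
}


def transliterate(text, key_map=KEY_MAP):
    """Replace ASCII key sequences with Amharic letters, longest key first.

    Characters that match no key (spaces, digits, punctuation) are copied
    through unchanged.
    """
    max_len = max(len(key) for key in key_map)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate at this position, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in key_map:
                out.append(key_map[chunk])
                i += size
                break
        else:
            # No key matched: keep the original character.
            out.append(text[i])
            i += 1
    return "".join(out)


print(transliterate("selam"))
```

Trying longer keys first matters because a short key such as `s` is a prefix of a longer key such as `se`. A shortest-first lookup would consume `s` alone and leave a stray `e` behind. Ambiguous inputs of this kind are plausibly where the exception handling or preprocessing mentioned in the abstract would apply.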
Similar resources
An ensemble of transliteration models for information retrieval
Transliteration is used to phonetically translate proper names and technical terms especially from languages in Roman alphabets to languages in non-Roman alphabets such as from English to Korean, Japanese, and Chinese. Because transliterations are usually representative index terms for documents, proper handling of the transliterations is important for an effective information retrieval system....
Hindi to English and Marathi to English Cross Language Information Retrieval Evaluation
In this paper, we present our Hindi to English and Marathi to English CLIR systems developed as part of our participation in the CLEF 2007 Ad-Hoc Bilingual task. We take a query translation based approach using bi-lingual dictionaries. Query words not found in the dictionary are transliterated using a simple rule based transliteration approach. The resultant transliteration is then compared wit...
Hindi and Marathi to English Cross Language Information
In this paper, we present our Hindi ->English and Marathi ->English CLIR systems developed as part of our participation in the CLEF 2007 Ad-Hoc Bilingual task. We take a query translation based approach using bi-lingual dictionaries. Query words not found in the dictionary are transliterated using a simple lookup table based transliteration approach. The resultant transliteration is then compar...
Korean-Chinese Cross-Language Information Retrieval Based on Extension of Dictionaries and Transliteration
This paper describes our Korean-Chinese cross-language information retrieval system. Our system uses a bi-lingual dictionary to perform query translation. We expand our bilingual dictionary by extracting words and their translations from the Wikipedia site, an online encyclopedia. To resolve the problem of translating Western people’s names into Chinese, we propose a transliteration mapping met...
Hindi and Marathi to English Cross Language Information Retrieval at CLEF 2007
In this paper, we present our Hindi ->English and Marathi ->English CLIR systems developed as part of our participation in the CLEF 2007 Ad-Hoc Bilingual task. We take a query translation based approach using bi-lingual dictionaries. Query words not found in the dictionary are transliterated using a simple lookup table based transliteration approach. The resultant transliteration is then compar...
Journal title:
- CoRR
Volume abs/1509.04811, Issue -
Pages -
Publication date 2015